Skip to content

[AArch64][SME] Fix iterator to fixupCalleeSaveRestoreStackOffset #110855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

sdesmalen-arm
Copy link
Collaborator

The iterator passed to fixupCalleeSaveRestoreStackOffset may be incorrect when it tries to skip over the instructions that get the current value of 'vg', when there is a 'rdsvl' instruction straight after the prologue. That's because it doesn't check that the instruction is still a 'frame-setup' instruction.

The iterator passed to fixupCalleeSaveRestoreStackOffset may be
incorrect when it tries to skip over the instructions that
get the current value of 'vg', when there is a 'rdsvl' instruction
straight after the prologue. That's because it doesn't check that
the instruction is still a 'frame-setup' instruction.
@llvmbot
Copy link
Member

llvmbot commented Oct 2, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Sander de Smalen (sdesmalen-arm)

Changes

The iterator passed to fixupCalleeSaveRestoreStackOffset may be incorrect when it tries to skip over the instructions that get the current value of 'vg', when there is a 'rdsvl' instruction straight after the prologue. That's because it doesn't check that the instruction is still a 'frame-setup' instruction.


Full diff: https://github.com/llvm/llvm-project/pull/110855.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+3-6)
  • (modified) llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll (+38)
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index fde07d84e97f58..9179da9cd1d814 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -1948,12 +1948,9 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
   // pointer bump above.
   while (MBBI != End && MBBI->getFlag(MachineInstr::FrameSetup) &&
          !IsSVECalleeSave(MBBI)) {
-    // Move past instructions generated to calculate VG
-    if (requiresSaveVG(MF))
-      while (isVGInstruction(MBBI))
-        ++MBBI;
-
-    if (CombineSPBump)
+    if (CombineSPBump &&
+        // Only fix-up frame-setup load/store instructions.
+        (!requiresSaveVG(MF) || !isVGInstruction(MBBI)))
       fixupCalleeSaveRestoreStackOffset(*MBBI, AFI->getLocalStackSize(),
                                         NeedsWinCFI, &HasWinCFI);
     ++MBBI;
diff --git a/llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll b/llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll
index 8724e7c1c368d9..5adeef3ab74552 100644
--- a/llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll
+++ b/llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll
@@ -1066,6 +1066,44 @@ define void @streaming_compatible_no_sve(i32 noundef %x) #4 {
   ret void
 }
 
+; The algorithm that fixes up the offsets of the callee-save/restore
+; instructions must jump over the instructions that instantiate the current
+; 'VG' value. We must make sure that it doesn't consider any RDSVL in
+; user-code as if it is part of the frame-setup when doing so.
+define void @test_rdsvl_right_after_prologue(i64 %x0) nounwind {
+; NO-SVE-CHECK-LABEL: test_rdsvl_right_after_prologue:
+; NO-SVE-CHECK:     // %bb.0:
+; NO-SVE-CHECK-NEXT: stp     d15, d14, [sp, #-96]!           // 16-byte Folded Spill
+; NO-SVE-CHECK-NEXT: stp     d13, d12, [sp, #16]             // 16-byte Folded Spill
+; NO-SVE-CHECK-NEXT: mov     x9, x0
+; NO-SVE-CHECK-NEXT: stp     d11, d10, [sp, #32]             // 16-byte Folded Spill
+; NO-SVE-CHECK-NEXT: stp     d9, d8, [sp, #48]               // 16-byte Folded Spill
+; NO-SVE-CHECK-NEXT: stp     x29, x30, [sp, #64]             // 16-byte Folded Spill
+; NO-SVE-CHECK-NEXT: bl      __arm_get_current_vg
+; NO-SVE-CHECK-NEXT: str     x0, [sp, #80]                   // 8-byte Folded Spill
+; NO-SVE-CHECK-NEXT: mov     x0, x9
+; NO-SVE-CHECK-NEXT: rdsvl   x8, #1
+; NO-SVE-CHECK-NEXT: add     x29, sp, #64
+; NO-SVE-CHECK-NEXT: lsr     x8, x8, #3
+; NO-SVE-CHECK-NEXT: mov     x1, x0
+; NO-SVE-CHECK-NEXT: smstart sm
+; NO-SVE-CHECK-NEXT: mov     x0, x8
+; NO-SVE-CHECK-NEXT: bl      bar
+; NO-SVE-CHECK-NEXT: smstop  sm
+; NO-SVE-CHECK-NEXT: ldp     x29, x30, [sp, #64]             // 16-byte Folded Reload
+; NO-SVE-CHECK-NEXT: ldp     d9, d8, [sp, #48]               // 16-byte Folded Reload
+; NO-SVE-CHECK-NEXT: ldp     d11, d10, [sp, #32]             // 16-byte Folded Reload
+; NO-SVE-CHECK-NEXT: ldp     d13, d12, [sp, #16]             // 16-byte Folded Reload
+; NO-SVE-CHECK-NEXT: ldp     d15, d14, [sp], #96             // 16-byte Folded Reload
+; NO-SVE-CHECK-NEXT: ret
+  %some_alloc = alloca i64, align 8
+  %rdsvl = tail call i64 @llvm.aarch64.sme.cntsd()
+  call void @bar(i64 %rdsvl, i64 %x0) "aarch64_pstate_sm_enabled"
+  ret void
+}
+
+declare void @bar(i64, i64)
+
 ; Ensure we still emit async unwind information with -fno-asynchronous-unwind-tables
 ; if the function contains a streaming-mode change.
 

Copy link
Contributor

@kmclaughlin-arm kmclaughlin-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this, LGTM!

@sdesmalen-arm sdesmalen-arm merged commit f314e12 into llvm:main Oct 15, 2024
10 checks passed
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
…m#110855)

The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be
incorrect when it tries to skip over the instructions that get the
current value of 'vg', when there is a 'rdsvl' instruction straight
after the prologue. That's because it doesn't check that the instruction
is still a 'frame-setup' instruction.
@sdesmalen-arm sdesmalen-arm added this to the LLVM 19.X Release milestone Nov 25, 2024
@sdesmalen-arm
Copy link
Collaborator Author

/cherry-pick f314e12

@llvmbot
Copy link
Member

llvmbot commented Nov 25, 2024

Failed to cherry-pick: f314e12

https://github.com/llvm/llvm-project/actions/runs/12018902390

Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request

sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Nov 26, 2024
…m#110855)

The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be
incorrect when it tries to skip over the instructions that get the
current value of 'vg', when there is a 'rdsvl' instruction straight
after the prologue. That's because it doesn't check that the instruction
is still a 'frame-setup' instruction.
sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Nov 26, 2024
…m#110855)

The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be
incorrect when it tries to skip over the instructions that get the
current value of 'vg', when there is a 'rdsvl' instruction straight
after the prologue. That's because it doesn't check that the instruction
is still a 'frame-setup' instruction.
tru pushed a commit to sdesmalen-arm/llvm-project that referenced this pull request Dec 3, 2024
…m#110855)

The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be
incorrect when it tries to skip over the instructions that get the
current value of 'vg', when there is a 'rdsvl' instruction straight
after the prologue. That's because it doesn't check that the instruction
is still a 'frame-setup' instruction.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

3 participants